pacman::p_load(olsrr, ggstatsplot, corrplot, ggpubr, sf, spdep, GWmodel, tmap, tidyverse, gtsummary, performance, see, sfdep)
Take Home Exercise 3a: Modelling Geography of Financial Inclusion with Geographically Weighted Methods
1. Introduction
According to Wikipedia, financial inclusion is the availability and equality of opportunities to access financial services. It refers to processes by which individuals and businesses can access appropriate, affordable, and timely financial products and services - which include banking, loan, equity, and insurance products. It provides paths to enhance inclusiveness in economic growth by enabling the unbanked population to access the means for savings, investment, and insurance towards improving household income and reducing income inequality.
2. The Task
In this take-home exercise, we build an explanatory model to determine the factors affecting financial inclusion by using geographically weighted regression methods.
3. The Data
Two data sets are used in this take-home exercise:
District-level boundary GIS data, which can be downloaded from the geoBoundaries portal.
FinScope Uganda 2023 survey data (FinScope-2023_Dataset_Final.csv), which is imported in section 5.2.
4. Importing Packages
Before we start the exercise, we need to load the necessary R packages. The key packages are:
olsrr package for building OLS models and performing diagnostic tests
GWmodel package for calibrating the geographically weighted family of models
corrplot package for multivariate data visualisation and analysis
sf package, which provides functions to manage, process, and manipulate Simple Features, a formal geospatial data standard that specifies a storage and access model of spatial geometries such as points, lines, and polygons
tmap package, which provides functions for plotting cartographic-quality static maps or interactive maps by using the leaflet API
The p_load() call at the top of this document installs (if necessary) and launches these R packages.
5. Getting the Data Into R Environment
5.1 Importing geospatial data
The geospatial data used in this take-home exercise is called geoBoundaries-UGA-ADM2. It is in ESRI shapefile format and consists of Uganda's district-level boundaries, represented as polygon features. As the import report below shows, the data is in the WGS 84 geographic coordinate system rather than a projected one.
The code chunk below imports the geoBoundaries-UGA-ADM2 shapefile by using st_read() of the sf package.
# Load district level boundary GIS data
boundaries2 <- st_read(dsn = "data/rawdata/geoBoundaries-UGA-ADM2-all",
                       layer = "geoBoundaries-UGA-ADM2")
Reading layer `geoBoundaries-UGA-ADM2' from data source
`C:\Users\user\OneDrive - Singapore Management University\MITB\6. Geospatial Analytics and Applications\jeffleesl\ISSS626-GAA\Take-Home_Ex\Take-Home_Ex03\data\rawdata\geoBoundaries-UGA-ADM2-all'
using driver `ESRI Shapefile'
Simple feature collection with 151 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 29.56838 ymin: -1.4732 xmax: 35.02676 ymax: 4.228399
Geodetic CRS: WGS 84
5.1.1 Updating CRS Information
Uganda is located in East Africa between 1° S and 4° N latitude, and between 30° E and 35° E longitude.
The code chunk below transforms the newly imported boundaries2 to a projected coordinate system by using EPSG code 32736 (WGS 84 / UTM zone 36S). EPSG 21096 (Arc 1960 / UTM zone 36N) is an alternative for Uganda.
# Transform to the correct EPSG code
boundaries <- st_transform(boundaries2, 32736)
# Verify the newly transformed boundaries
st_crs(boundaries)
Coordinate Reference System:
User input: EPSG:32736
wkt:
PROJCRS["WGS 84 / UTM zone 36S",
BASEGEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["UTM zone 36S",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",33,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",10000000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Navigation and medium accuracy spatial referencing."],
AREA["Between 30°E and 36°E, southern hemisphere between 80°S and equator, onshore and offshore. Burundi. Eswatini (Swaziland). Kenya. Malawi. Mozambique. Rwanda. South Africa. Tanzania. Uganda. Zambia. Zimbabwe."],
BBOX[-80,30,0,36]],
ID["EPSG",32736]]
st_bbox(boundaries) # view extent
      xmin       ymin       xmax       ymax
  117997.3  9836930.8   725449.1 10467443.7
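The choice of EPSG:32736 can be sanity-checked from the bounding box reported at import. The sketch below is an illustration only: utm_epsg() is a hypothetical helper (it is not part of sf) that applies the standard 6-degree UTM zone rule together with the WGS 84 / UTM code pattern (326xx for the northern hemisphere, 327xx for the southern hemisphere).

```r
# Hypothetical helper (not an sf function): derive the WGS 84 / UTM EPSG code
# for a longitude/latitude pair using the standard 6-degree zone rule.
utm_epsg <- function(lon, lat) {
  zone <- floor((lon + 180) / 6) + 1      # UTM zones are 6 degrees wide
  base <- if (lat >= 0) 32600 else 32700  # 326xx = north, 327xx = south
  base + zone
}

# Centre of the bounding box printed by st_read() above
lon_mid <- (29.56838 + 35.02676) / 2
lat_mid <- (-1.4732 + 4.228399) / 2

utm_epsg(lon_mid, lat_mid)  # code for the centre of the box (zone 36, north)
utm_epsg(lon_mid, -1.4732)  # code for the southern edge (zone 36, south)
```

Uganda straddles the equator, so both zone 36 codes are plausible. Zone 36S keeps every northing positive, which is consistent with the ymax of roughly 10,467,444 m in the st_bbox() output.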
tm_shape(boundaries) +
tm_polygons()
## Convert multipolygons to individual polygons
boundaries_sf <- boundaries %>%
  st_cast("POLYGON") %>%
  mutate(area = st_area(.))
Warning in st_cast.sf(., "POLYGON"): repeating attributes for all
sub-geometries for which they may not be constant
## Group by the unique name and select the largest polygon by area
boundaries_cleaned <- boundaries_sf %>%
  group_by(shapeName) %>%
  filter(area == max(area)) %>%
  ungroup() %>%
  select(-area) %>%
  select(shapeName) %>%
  rename(
    county_name = shapeName
  )
tm_shape(boundaries_cleaned) +
  tm_polygons()
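The "keep the largest polygon per district" step can be illustrated without any spatial packages. The sketch below uses a made-up attribute table (the names and areas are invented) and base R's ave() to mimic what group_by() followed by filter(area == max(area)) does.

```r
# Toy attribute table standing in for boundaries_sf: one row per polygon part.
parts <- data.frame(
  shapeName = c("A", "A", "B", "C", "C", "C"),
  area      = c(10, 250, 90, 5, 7, 300)
)

# ave() returns the group-wise maximum aligned to each row, so the comparison
# keeps exactly the rows that filter(area == max(area)) would keep per group.
largest <- parts[parts$area == ave(parts$area, parts$shapeName, FUN = max), ]
largest
```

One consequence worth keeping in mind is that smaller detached parts of a district are dropped, so survey points located in those parts would not fall inside any polygon in a later spatial overlay. Exact ties in area would also keep more than one row per name.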
5.2 Importing the aspatial data, FinScope Uganda
The FinScope-2023_Dataset_Final data set is in csv file format. The code chunk below uses the read_csv() function of the readr package to import FinScope-2023_Dataset_Final into R as a tibble data frame called uganda_data.
uganda_data <- read_csv("data/rawdata/FinScope-2023_Dataset_Final.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
dat <- vroom(...)
problems(dat)
Rows: 3176 Columns: 686
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (25): HH_ID, Interview_ID, ea_name, District, Region, Subregion, Rural_...
dbl (638): ea_code, age, disabled, Pweight, Lhhid, Enum_code, InterviewDate,...
lgl (23): f3_2_14, f3_2_15, f3_3_14, f3_3_15, f3_5_14, f3_5_15, g12_2_1, g1...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
5.2.1 Variables to Consider for Financial Inclusion
Check the column names in uganda_data to identify the variables of interest.
colnames(uganda_data) # Displays all column names in the dataset
To determine factors affecting financial inclusion, consider including the following types of variables:
Age and Age Band
Gender
Education Level
Mobile User
Income Level
Employment Status
Urban vs. rural status
Distance to nearest bank or financial institution from Home (Commercial Bank, SACCO and Mobile Money)
Distance to nearest ATM from Home
Financial Advice
Save Money and the channel (Commercial Bank, SACCO and Mobile Money)
Last amount saved
Borrow Money and the channel (Commercial Bank, SACCO and Mobile Money)
Last amount borrowed
Last amount sent
Last amount received
Documentation for KYC (National Identification Card, Passport, Utilities and Pay Slip)
Self Sustaining
5.2.1.1 Rename the Variables
uganda_data_rename <- uganda_data %>%
select(-c(2:7, 9, 11:17, 20, 23:28, 30:34, 36:37, 40:43, 45:63, 65, 67:90, 93:94, 96, 98:167, 169:230, 232:234, 236:238, 240:241, 243:342, 344:364, 366:384, 386:438, 440:444, 448:473, 476:674, 677:679, 681:686)) %>%
rename(
age_band = c1,
gender = c2,
education_level = c4,
employment_status = c5,
mobile_user = c7_1_1,
national_ic_doc = c8_1a,
passport_doc = c8_1d,
utilities_bill_doc = c8_1e,
pay_slip_doc = c8_1j,
self_sustaining = e1_1,
financial_advice = e3_1,
save_money = f2_1,
save_money_commercial_bank = f3_1_1,
save_money_SACCO = f3_1_4,
save_money_mobile_money = f3_1_6,
last_amt_saved = f6_1,
last_amt_borrowed = g3_3,
borrow_money_commercial_bank = g6_1_1,
borrow_money_SACCO = g6_1_5,
borrow_money_mobile_money = g6_1_8,
last_amt_sent = hpp3_2,
last_amt_received = hpp6_2,
own_insurance = j1,
distance_commerical_bank = k1_1_1,
distance_SACCOS = k1_1_7,
distance_ATM = k1_1_8,
distance_mobile_money = k1_1_9,
savings_account = kcb1_1_1,
joint_account = kcb1_1_2,
latitude = hh_gps_latitude,
longitude = hh_gps_longitude,
county_name = s1aq2b
)
5.2.1.2 Clean the Variables
uganda_data_new <- uganda_data_rename %>%
filter(!is.na(longitude) & longitude != "",
!is.na(latitude) & latitude != "") %>%
replace_na(list(
save_money_commercial_bank = 2,
save_money_SACCO = 2,
save_money_mobile_money = 2,
last_amt_saved = 9,
last_amt_borrowed = 998,
borrow_money_commercial_bank = 2,
borrow_money_SACCO = 2,
borrow_money_mobile_money = 2,
last_amt_sent = 998,
last_amt_received = 998
)) %>%
  mutate(across(c(savings_account, joint_account),
                ~ if_else(is.na(.) | . == "", 2, .)))
head(uganda_data_new$longitude) # see the data in the longitude column
[1] 33.65414 33.65328 33.65403 33.65586 33.65472 33.65549
head(uganda_data_new$latitude) # see the data in the latitude column
[1] 2.677662 2.675690 2.673339 2.671343 2.671923 2.672423
Next, summary() of base R is used to display the summary statistics of uganda_data_new tibble data frame.
summary(uganda_data_new)
HH_ID Region Rural_Urban age_band
Length:3176 Length:3176 Length:3176 Min. :1.000
Class :character Class :character Class :character 1st Qu.:3.000
Mode :character Mode :character Mode :character Median :4.000
Mean :3.986
3rd Qu.:5.000
Max. :7.000
gender education_level employment_status mobile_user
Min. :1.000 Min. :1.000 Min. : 1.00 Min. :1.000
1st Qu.:1.000 1st Qu.:2.000 1st Qu.: 1.00 1st Qu.:1.000
Median :2.000 Median :3.000 Median : 2.00 Median :1.000
Mean :1.552 Mean :3.169 Mean : 3.67 Mean :1.273
3rd Qu.:2.000 3rd Qu.:4.000 3rd Qu.: 5.00 3rd Qu.:2.000
Max. :2.000 Max. :9.000 Max. :99.00 Max. :2.000
national_ic_doc passport_doc utilities_bill_doc pay_slip_doc
Min. :1.000 Min. :1.00 Min. :1.000 Min. :1.000
1st Qu.:1.000 1st Qu.:2.00 1st Qu.:2.000 1st Qu.:2.000
Median :1.000 Median :2.00 Median :2.000 Median :2.000
Mean :1.171 Mean :1.96 Mean :1.926 Mean :1.964
3rd Qu.:1.000 3rd Qu.:2.00 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.00 Max. :2.000 Max. :2.000
self_sustaining financial_advice save_money save_money_commercial_bank
Min. :1.000 Min. :1.000 Min. :1.00 Min. :1.000
1st Qu.:2.000 1st Qu.:1.000 1st Qu.:1.00 1st Qu.:2.000
Median :2.000 Median :1.000 Median :1.00 Median :2.000
Mean :1.846 Mean :1.401 Mean :1.36 Mean :1.896
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.00 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.00 Max. :2.000
save_money_SACCO save_money_mobile_money last_amt_saved last_amt_borrowed
Min. :1.000 Min. :1.000 Min. :1.000 Min. : 1.0
1st Qu.:2.000 1st Qu.:1.000 1st Qu.:1.000 1st Qu.: 3.0
Median :2.000 Median :2.000 Median :3.000 Median :998.0
Mean :1.899 Mean :1.738 Mean :4.646 Mean :626.1
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:9.000 3rd Qu.:998.0
Max. :2.000 Max. :2.000 Max. :9.000 Max. :998.0
borrow_money_commercial_bank borrow_money_SACCO borrow_money_mobile_money
Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000
Median :2.000 Median :2.000 Median :2.000
Mean :1.983 Mean :1.978 Mean :1.971
3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000
last_amt_sent last_amt_received own_insurance distance_commerical_bank
Min. : 1.0 Min. : 1.0 Min. :1.000 Min. :1.000
1st Qu.: 1.0 1st Qu.: 1.0 1st Qu.:2.000 1st Qu.:2.000
Median :998.0 Median :997.0 Median :2.000 Median :4.000
Mean :582.9 Mean :510.8 Mean :1.974 Mean :3.176
3rd Qu.:998.0 3rd Qu.:998.0 3rd Qu.:2.000 3rd Qu.:4.000
Max. :998.0 Max. :998.0 Max. :2.000 Max. :4.000
distance_SACCOS distance_ATM distance_mobile_money savings_account
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:1.000 1st Qu.:2.000
Median :2.000 Median :4.000 Median :1.000 Median :2.000
Mean :2.508 Mean :3.152 Mean :1.655 Mean :1.963
3rd Qu.:4.000 3rd Qu.:4.000 3rd Qu.:2.000 3rd Qu.:2.000
Max. :4.000 Max. :4.000 Max. :4.000 Max. :2.000
joint_account latitude longitude county_name
Min. :1 Min. :-1.4128 Min. : 0.00 Length:3176
1st Qu.:2 1st Qu.: 0.2393 1st Qu.:30.99 Class :character
Median :2 Median : 0.7726 Median :32.54 Mode :character
Mean :2 Mean : 0.9945 Mean :31.53
3rd Qu.:2 3rd Qu.: 1.9143 3rd Qu.:33.52
Max. :2 Max. : 3.6876 Max. :34.96
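Two features of the summary above deserve a closer look: the amount variables have medians of 997 or 998, which reflects the placeholder values written by replace_na(), and the minimum longitude is 0, which lies far outside Uganda. The sketch below uses invented toy values to show one possible way of handling both. Whether this treatment is appropriate depends on the FinScope codebook, which is not reproduced here, so it should be read as an assumption rather than as the method used in this exercise.

```r
# Toy values mimicking the summary above; 998 is the placeholder written by
# replace_na(), and a longitude of 0 lies far outside Uganda.
toy <- data.frame(
  last_amt_borrowed = c(1, 3, 998, 998, 2, 998),
  longitude         = c(33.65, 0, 32.54, 30.99, 34.96, 33.52),
  latitude          = c(2.68, 0, 0.77, 0.24, 1.91, -1.41)
)

# Treat the placeholder as missing before summarising, so it cannot inflate the mean.
borrowed <- ifelse(toy$last_amt_borrowed == 998, NA, toy$last_amt_borrowed)
mean(toy$last_amt_borrowed)   # dominated by the 998 placeholder
mean(borrowed, na.rm = TRUE)  # mean of the remaining answers only

# Keep only points inside the boundary layer's bounding box (from st_read()).
in_bbox <- toy$longitude >= 29.56838 & toy$longitude <= 35.02676 &
           toy$latitude  >= -1.4732  & toy$latitude  <= 4.228399
toy[in_bbox, ]
```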
5.2.2 Convert to Percentage and Log-Transformations
The rescaled and log-transformed variables created below are used later when calibrating the geographically weighted regression models.
uganda_data_fin <- uganda_data_new %>%
mutate(
LOG_age_band = log(age_band),
gender_pct = gender / 3176 * 100,
LOG_education_level = log(education_level),
LOG_employment_status = log(employment_status),
mobile_user_pct = mobile_user / 3176 * 100,
national_ic_doc_pct = national_ic_doc / 3176 * 100,
passport_doc_pct = passport_doc / 3176 * 100,
utilities_bill_do_pct = utilities_bill_doc / 3176 * 100,
pay_slip_doc_pct = pay_slip_doc / 3176 * 100,
self_sustaining_pct = self_sustaining / 3176 * 100,
financial_advice_pct = financial_advice / 3176 * 100,
save_money_pct = save_money / 3176 * 100,
save_money_commercial_bank_pct = save_money_commercial_bank / 3176 * 100,
save_money_SACCO_pct = save_money_SACCO / 3176 * 100,
save_money_mobile_money_pct = save_money_mobile_money / 3176 * 100,
LOG_last_amt_saved = log(last_amt_saved),
LOG_last_amt_borrowed = log(last_amt_borrowed),
borrow_money_commercial_bank_pct = borrow_money_commercial_bank / 3176 * 100,
borrow_money_SACCO_pct = borrow_money_SACCO / 3176 * 100,
borrow_money_mobile_money_pct = borrow_money_mobile_money / 3176 * 100,
LOG_last_amt_sent = log(last_amt_sent),
LOG_last_amt_received = log(last_amt_received),
own_insurance_pct = own_insurance / 3176 * 100,
LOG_distance_commerical_bank = log(distance_commerical_bank),
LOG_distance_SACCOS = log(distance_SACCOS),
LOG_distance_ATM = log(distance_ATM),
LOG_distance_mobile_money = log(distance_mobile_money),
savings_account_pct = savings_account / 3176 * 100,
joint_account_pct = joint_account / 3176 * 100
)
5.3 Converting aspatial data frame into an sf object
Currently, the uganda_data_fin tibble data frame is aspatial. We will convert it to an sf object. The code chunk below converts the uganda_data_fin data frame into a simple feature data frame by using st_as_sf() of the sf package.
uganda_data.sf <- st_as_sf(uganda_data_fin,
                           coords = c("longitude", "latitude"),
                           crs=4326) %>%
  st_transform(crs=32736)
Notice that st_transform() of the sf package is used to convert the coordinates from WGS 84 (i.e. crs=4326) to WGS 84 / UTM zone 36S (i.e. crs=32736).
Next, head() is used to list the content of uganda_data.sf object.
head(uganda_data.sf)
Simple feature collection with 6 features and 62 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 572616.1 ymin: 10295290 xmax: 572903.6 ymax: 10295980
Projected CRS: WGS 84 / UTM zone 36S
# A tibble: 6 × 63
HH_ID Region Rural_Urban age_band gender education_level employment_status
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 001001 NORTHERN Urban 4 2 6 1
2 001019 NORTHERN Urban 4 2 2 5
3 001028 NORTHERN Urban 3 2 1 5
4 001037 NORTHERN Urban 4 1 2 1
5 001040 NORTHERN Urban 4 2 3 4
6 001047 NORTHERN Urban 1 1 2 9
# ℹ 56 more variables: mobile_user <dbl>, national_ic_doc <dbl>,
# passport_doc <dbl>, utilities_bill_doc <dbl>, pay_slip_doc <dbl>,
# self_sustaining <dbl>, financial_advice <dbl>, save_money <dbl>,
# save_money_commercial_bank <dbl>, save_money_SACCO <dbl>,
# save_money_mobile_money <dbl>, last_amt_saved <dbl>,
# last_amt_borrowed <dbl>, borrow_money_commercial_bank <dbl>,
# borrow_money_SACCO <dbl>, borrow_money_mobile_money <dbl>, …
Notice that the output is a point feature data frame.
6. Exploratory Data Analysis (EDA)
Use statistical graphics functions of ggplot2 package to perform EDA
6.1 EDA using statistical graphics
Plot the distributions of savings_account_pct and save_money_mobile_money_pct as histograms, as shown in the code chunks below.
ggplot(data=uganda_data.sf, aes(x=`savings_account_pct`)) +
geom_histogram(bins=20, color="black", fill="light blue")
ggplot(data=uganda_data.sf, aes(x=`save_money_mobile_money_pct`)) +
geom_histogram(bins=20, color="#0B2130", fill="#AB88BA")
6.2 Multiple Histogram Plots distribution of variables
Draw multiple histograms (also known as a trellis plot) by using ggarrange() of the ggpubr package to analyse the variables.
LOG_age_band <- ggplot(data=uganda_data.sf, aes(x= `LOG_age_band`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
gender_pct <- ggplot(data=uganda_data.sf, aes(x= `gender_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
LOG_education_level <- ggplot(data=uganda_data.sf, aes(x= `LOG_education_level`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
LOG_employment_status <- ggplot(data=uganda_data.sf, aes(x= `LOG_employment_status`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
mobile_user_pct <- ggplot(data=uganda_data.sf, aes(x= `mobile_user_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
national_ic_doc_pct <- ggplot(data=uganda_data.sf, aes(x= `national_ic_doc_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
passport_doc_pct <- ggplot(data=uganda_data.sf, aes(x= `passport_doc_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
utilities_bill_do_pct <- ggplot(data=uganda_data.sf, aes(x= `utilities_bill_do_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
pay_slip_doc_pct <- ggplot(data=uganda_data.sf, aes(x= `pay_slip_doc_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
self_sustaining_pct <- ggplot(data=uganda_data.sf, aes(x= `self_sustaining_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
financial_advice_pct <- ggplot(data=uganda_data.sf, aes(x= `financial_advice_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
save_money_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
save_money_commercial_bank_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_commercial_bank_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
save_money_SACCO_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_SACCO_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
save_money_mobile_money_pct <- ggplot(data=uganda_data.sf, aes(x= `save_money_mobile_money_pct`)) +
geom_histogram(bins=20, color="black", fill="#FFC166")
ggarrange(LOG_age_band, gender_pct, LOG_education_level, LOG_employment_status,
mobile_user_pct, national_ic_doc_pct, passport_doc_pct, utilities_bill_do_pct,
pay_slip_doc_pct, self_sustaining_pct, financial_advice_pct, save_money_pct,
save_money_commercial_bank_pct, save_money_SACCO_pct, save_money_mobile_money_pct,
ncol = 3, nrow = 5)
LOG_last_amt_saved <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_saved`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_last_amt_borrowed <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_borrowed`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
borrow_money_commercial_bank_pct <- ggplot(data=uganda_data.sf, aes(x= `borrow_money_commercial_bank_pct`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
borrow_money_SACCO_pct <- ggplot(data=uganda_data.sf, aes(x= `borrow_money_SACCO_pct`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
borrow_money_mobile_money_pct <- ggplot(data=uganda_data.sf, aes(x= `borrow_money_mobile_money_pct`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_last_amt_sent <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_sent`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_last_amt_received <- ggplot(data=uganda_data.sf, aes(x= `LOG_last_amt_received`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
own_insurance_pct <- ggplot(data=uganda_data.sf, aes(x= `own_insurance_pct`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_distance_commerical_bank <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_commerical_bank`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_distance_SACCOS <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_SACCOS`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_distance_ATM <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_ATM`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
LOG_distance_mobile_money <- ggplot(data=uganda_data.sf, aes(x= `LOG_distance_mobile_money`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
savings_account_pct <- ggplot(data=uganda_data.sf, aes(x= `savings_account_pct`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
joint_account_pct <- ggplot(data=uganda_data.sf, aes(x= `joint_account_pct`)) +
geom_histogram(bins=20, color="#0A1E0F", fill="#A9CB9F")
ggarrange(LOG_last_amt_saved, LOG_last_amt_borrowed, borrow_money_commercial_bank_pct, borrow_money_SACCO_pct,
borrow_money_mobile_money_pct, LOG_last_amt_sent, LOG_last_amt_received, own_insurance_pct,
LOG_distance_commerical_bank, LOG_distance_SACCOS, LOG_distance_ATM, LOG_distance_mobile_money,
savings_account_pct, joint_account_pct,
ncol = 4, nrow = 4)
The plots show that the majority of people lack insurance, savings, and joint accounts. Many respondents also feel they do not have enough money and would like to seek financial advice. Additionally, a large portion of those surveyed report that bank branches and ATMs are located relatively far from their homes.
7. Correlation Analysis - ggstatsplot methods
ggcorrmat(uganda_data_fin[, 36:62])
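A correlation matrix like this is typically read to spot pairs of predictors that are so strongly correlated that they should not enter the same regression. The sketch below shows the same screening idea numerically with base R's cor() on simulated data. The variable names and the 0.8 cut-off are assumptions chosen for illustration, not values taken from this exercise.

```r
set.seed(42)                        # simulated data, for illustration only
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)       # nearly a copy of x1, so highly correlated
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

r <- cor(X)                         # Pearson correlation matrix
r[upper.tri(r, diag = TRUE)] <- NA  # keep each pair once
high <- which(abs(r) > 0.8, arr.ind = TRUE)

# Name the pairs whose absolute correlation exceeds the assumed 0.8 threshold
data.frame(var1 = rownames(r)[high[, "row"]],
           var2 = colnames(r)[high[, "col"]],
           r    = r[high])
```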
8. Regression Modelling of Savings Account Ownership in R
8.1 Simple Linear Regression Method
Build a simple linear regression model by using savings_account_pct as the dependent variable and LOG_distance_commerical_bank as the independent variable.
uganda.slr <- lm(formula=savings_account_pct ~ LOG_distance_commerical_bank, data = uganda_data.sf)
summary(uganda.slr)
Call:
lm(formula = savings_account_pct ~ LOG_distance_commerical_bank,
data = uganda_data.sf)
Residuals:
Min 1Q Median 3Q Max
-0.0309296 0.0005565 0.0005565 0.0019347 0.0033128
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0596595 0.0002987 199.745 < 2e-16 ***
LOG_distance_commerical_bank 0.0019882 0.0002574 7.723 1.51e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.005854 on 3174 degrees of freedom
Multiple R-squared: 0.01845, Adjusted R-squared: 0.01814
F-statistic: 59.65 on 1 and 3174 DF, p-value: 1.509e-14
The output report reveals that savings_account_pct can be explained by using the formula:
y = 0.0596595 + 0.0019882 x1, where y is savings_account_pct and x1 is LOG_distance_commerical_bank.
The R-squared of 0.01845 reveals that the simple regression model explains only about 1.8% of the variation in savings_account_pct.
Since the p-value (1.509e-14) is much smaller than 0.0001, we reject the null hypothesis that the mean is a good estimator of savings_account_pct. The regression model is therefore a better estimator than the mean, even though its explanatory power is very low.
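The quantities in a simple linear regression summary are linked by a few identities that make the report easier to check. The sketch below uses simulated data (not the FinScope survey) to show two of them with base R's lm(), and then applies the decision rule to the p-value reported above. The 0.001 significance level is an assumption for illustration.

```r
set.seed(1)                           # simulated data, not the FinScope survey
x <- runif(100)
y <- 2 + 0.5 * x + rnorm(100, sd = 1)

fit <- lm(y ~ x)
s   <- summary(fit)

# With a single predictor, R-squared equals the squared Pearson correlation.
all.equal(s$r.squared, cor(x, y)^2)

# The t value is the estimate divided by its standard error.
coefs <- s$coefficients
all.equal(coefs["x", "t value"],
          coefs["x", "Estimate"] / coefs["x", "Std. Error"])

# Decision rule at an assumed 0.001 level, using the p-value reported above.
p_reported <- 1.509e-14
p_reported < 0.001  # TRUE, so the null hypothesis is rejected
```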
8.2 Multiple Linear Regression Method
The code chunk below uses lm() to calibrate the multiple linear regression model.
sa_mlr <- lm(formula = savings_account_pct ~ LOG_age_band + gender_pct + LOG_education_level + LOG_employment_status + mobile_user_pct + national_ic_doc_pct + passport_doc_pct + utilities_bill_do_pct + pay_slip_doc_pct + self_sustaining_pct + financial_advice_pct +save_money_pct + save_money_commercial_bank_pct + save_money_SACCO_pct + save_money_mobile_money_pct + LOG_last_amt_saved + LOG_last_amt_borrowed + borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + borrow_money_mobile_money_pct + LOG_last_amt_sent + LOG_last_amt_received + LOG_distance_commerical_bank + LOG_distance_SACCOS + LOG_distance_ATM + LOG_distance_mobile_money + joint_account_pct,
data=uganda_data.sf)
summary(sa_mlr)
Call:
lm(formula = savings_account_pct ~ LOG_age_band + gender_pct +
LOG_education_level + LOG_employment_status + mobile_user_pct +
national_ic_doc_pct + passport_doc_pct + utilities_bill_do_pct +
pay_slip_doc_pct + self_sustaining_pct + financial_advice_pct +
save_money_pct + save_money_commercial_bank_pct + save_money_SACCO_pct +
save_money_mobile_money_pct + LOG_last_amt_saved + LOG_last_amt_borrowed +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_last_amt_sent + LOG_last_amt_received +
LOG_distance_commerical_bank + LOG_distance_SACCOS + LOG_distance_ATM +
LOG_distance_mobile_money + joint_account_pct, data = uganda_data.sf)
Residuals:
Min 1Q Median 3Q Max
-0.032541 -0.000754 -0.000117 0.000530 0.016876
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.212e-02 1.009e-02 2.193 0.02841 *
LOG_age_band -1.489e-04 2.251e-04 -0.661 0.50854
gender_pct -6.663e-03 5.831e-03 -1.143 0.25325
LOG_education_level -1.725e-04 1.855e-04 -0.930 0.35259
LOG_employment_status 6.847e-05 1.123e-04 0.610 0.54194
mobile_user_pct -7.065e-03 7.521e-03 -0.939 0.34755
national_ic_doc_pct -1.000e-03 8.438e-03 -0.119 0.90567
passport_doc_pct 4.086e-02 1.547e-02 2.642 0.00828 **
utilities_bill_do_pct 2.092e-02 1.186e-02 1.763 0.07792 .
pay_slip_doc_pct 4.974e-02 1.600e-02 3.110 0.00189 **
self_sustaining_pct 2.396e-02 8.176e-03 2.931 0.00341 **
financial_advice_pct 2.178e-03 6.152e-03 0.354 0.72338
save_money_pct -6.590e-02 1.206e-02 -5.463 5.04e-08 ***
save_money_commercial_bank_pct 1.816e-01 1.099e-02 16.530 < 2e-16 ***
save_money_SACCO_pct 2.195e-01 1.037e-02 21.158 < 2e-16 ***
save_money_mobile_money_pct -1.070e-02 7.546e-03 -1.418 0.15632
LOG_last_amt_saved 5.387e-04 1.739e-04 3.098 0.00197 **
LOG_last_amt_borrowed 4.239e-05 3.228e-05 1.313 0.18920
borrow_money_commercial_bank_pct 8.934e-02 2.279e-02 3.920 9.03e-05 ***
borrow_money_SACCO_pct 9.115e-02 2.028e-02 4.494 7.23e-06 ***
borrow_money_mobile_money_pct 4.941e-02 1.743e-02 2.835 0.00461 **
LOG_last_amt_sent 5.258e-05 3.416e-05 1.539 0.12381
LOG_last_amt_received -2.109e-05 3.265e-05 -0.646 0.51834
LOG_distance_commerical_bank 1.280e-03 4.694e-04 2.727 0.00643 **
LOG_distance_SACCOS -4.183e-05 2.315e-04 -0.181 0.85664
LOG_distance_ATM -6.954e-04 4.648e-04 -1.496 0.13473
LOG_distance_mobile_money -4.169e-06 2.132e-04 -0.020 0.98440
joint_account_pct -6.535e-02 1.570e-01 -0.416 0.67730
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.004901 on 3148 degrees of freedom
Multiple R-squared: 0.3177, Adjusted R-squared: 0.3118
F-statistic: 54.28 on 27 and 3148 DF, p-value: < 2.2e-16
The Multiple R-squared is 0.3177, suggesting that around 31.8% of the variability in savings_account_pct is explained by the predictor variables in the model. The Adjusted R-squared is 0.3118, which accounts for the number of predictors, confirming that the model provides a modest fit.
The F-statistic of 54.28 with a very low p-value (< 2.2e-16) indicates that the model is statistically significant overall, meaning at least one of the predictors is significantly associated with savings_account_pct.
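The adjusted R-squared and the F-statistic quoted above can both be recovered from the multiple R-squared, the sample size, and the number of predictors. The short check below applies the standard formulas to the values printed by summary(sa_mlr). Small differences in the last digit are expected because the inputs are rounded.

```r
# Values read from the summary(sa_mlr) output above
r2 <- 0.3177  # Multiple R-squared
n  <- 3176    # observations (3148 residual df + 27 predictors + intercept)
k  <- 27      # number of predictors

adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

round(adj_r2, 4)
round(f_stat, 2)
```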
Key Predictors
The coefficients tell us the direction and strength of each predictor’s relationship with savings_account_pct. Here are some significant predictors:
Passport Documentation (passport_doc_pct, p = 0.00828): Positive relationship (Estimate = 0.0409). A higher percentage of individuals with passport documentation is associated with an increase in savings_account_pct.
Payslip Documentation (pay_slip_doc_pct, p = 0.00189): Positive relationship (Estimate = 0.0497). A higher percentage of individuals with pay slip documentation correlates with a higher percentage of savings accounts. This might suggest that stable income documentation positively influences savings account ownership.
Self-Sustaining Percentage (self_sustaining_pct, p = 0.00341): Positive relationship (Estimate = 0.0240). This suggests that communities with more self-sustaining individuals are more likely to have savings accounts.
Saving with Commercial Banks and SACCOs:
Save Money (save_money_pct, p = 5.04e-08, Estimate = -0.0659): Negative coefficient, implying that higher rates of people saving money in general are associated with a lower savings account percentage, possibly indicating the use of other, informal saving channels.
Save Money with Commercial Bank Percentage (save_money_commercial_bank_pct, p < 2e-16, Estimate = 0.1816): Strong positive relationship, indicating that those who save with commercial banks are highly likely to have a savings account.
Save Money with SACCO Percentage (save_money_SACCO_pct, p < 2e-16, Estimate = 0.2195): Also a strong positive relationship, reinforcing that participation in SACCOs (Savings and Credit Cooperative Organizations) is a strong indicator of savings account ownership.
Borrowing from Commercial Banks and SACCOs:
Borrow Money from Commercial Bank Percentage (borrow_money_commercial_bank_pct, p = 9.03e-05, Estimate = 0.0893): Indicates a positive association with savings account ownership.
Borrow Money from SACCO Percentage (borrow_money_SACCO_pct, p = 7.23e-06, Estimate = 0.0912): Similar positive relationship, suggesting that access to borrowing services is linked to having savings accounts.
Distance to Commercial Bank (LOG_distance_commerical_bank, p = 0.00643): Positive relationship (Estimate = 0.0013). As the log-distance to commercial banks increases, there’s a small but significant increase in the savings account percentage, which might suggest limited access drives individuals to hold savings accounts if they are already banked.
Last Amount Saved (LOG_last_amt_saved, p = 0.00197): Positive relationship (Estimate = 0.00054). The log of the last amount saved has a slight positive effect on savings account ownership, suggesting that recent saving activity is associated with having a savings account.
Non-significant Predictors
Some predictors, such as gender_pct, LOG_education_level, and LOG_age_band, show non-significant effects (high p-values). This may suggest that demographic factors like age, gender, and education do not directly impact the likelihood of savings account ownership in this context.
Documentation (like passports and pay slips) and income stability appear to be important indicators of savings account ownership.
Engagement with formal and semi-formal financial services, such as commercial banks and SACCOs, positively influences the likelihood of having a savings account.
Access to borrowing services is also positively linked to savings account ownership, suggesting that people who have access to credit may be more financially integrated.
Distance to financial services can have a minor influence, suggesting a possible need for financial services closer to communities to further increase account ownership rates.
In conclusion, the model suggests that factors related to financial habits (saving and borrowing with formal institutions) and access to financial documentation have significant impacts on the likelihood of savings account ownership in Uganda.
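When reading the coefficient magnitudes, it helps to recall how the variables were built in section 5.2.2: every _pct variable is the raw survey code divided by 3176 and multiplied by 100. Rescaling the response and a predictor by the same constant does not change the slope between them, so a coefficient between two _pct variables equals the slope between the underlying codes. The sketch below demonstrates this with simulated 1/2-coded answers. The data are invented and the 70% agreement rate is an arbitrary assumption.

```r
set.seed(7)                                    # simulated 1/2-coded answers
n <- 3176
x <- sample(1:2, n, replace = TRUE)
y <- ifelse(runif(n) < 0.7, x, 3 - x)          # y agrees with x about 70% of the time

raw    <- lm(y ~ x)
scaled <- lm(I(y / n * 100) ~ I(x / n * 100))  # same rescaling as the *_pct variables

# Rescaling both sides by the same constant leaves the slope unchanged.
unname(coef(raw)[2])
unname(coef(scaled)[2])
all.equal(unname(coef(raw)[2]), unname(coef(scaled)[2]))
```

The LOG_ predictors are not rescaled in this way, so their coefficients are on a different scale from those of the _pct predictors and the two groups should not be compared by raw magnitude.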
9. Preparing Publication Quality Table: olsrr method
With reference to the report above, it is clear that not all the independent variables are statistically significant. We will revise the model by removing those variables which are not statistically significant.
Now, we are ready to calibrate the revised model by using the code chunk below.
sa_mlr1 <- lm(formula = savings_account_pct ~ passport_doc_pct + pay_slip_doc_pct + self_sustaining_pct + save_money_pct + save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved + borrow_money_commercial_bank_pct + borrow_money_SACCO_pct + borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data=uganda_data.sf)
ols_regress(sa_mlr1)
                          Model Summary
------------------------------------------------------------------
R                       0.560       RMSE                  0.005
R-Squared               0.314       MSE                   0.000
Adj. R-Squared          0.312       Coef. Var             7.928
Pred R-Squared          0.298       AIC              -24754.869
MAE                     0.002       SBC              -24676.045
------------------------------------------------------------------
RMSE: Root Mean Square Error
MSE: Mean Square Error
MAE: Mean Absolute Error
AIC: Akaike Information Criteria
SBC: Schwarz Bayesian Criteria
                               ANOVA
----------------------------------------------------------------------
               Sum of
              Squares      DF    Mean Square          F        Sig.
----------------------------------------------------------------------
Regression      0.035      11          0.003    131.713      0.0000
Residual        0.076    3164          0.000
Total           0.111    3175
----------------------------------------------------------------------
                                          Parameter Estimates
-------------------------------------------------------------------------------------------------------------
model                                 Beta    Std. Error    Std. Beta         t      Sig     lower     upper
-------------------------------------------------------------------------------------------------------------
(Intercept)                          0.017         0.002                  8.139    0.000     0.013     0.021
passport_doc_pct                     0.047         0.015        0.049     3.112    0.002     0.017     0.076
pay_slip_doc_pct                     0.054         0.016        0.053     3.412    0.001     0.023     0.084
self_sustaining_pct                  0.023         0.008        0.044     2.841    0.005     0.007     0.038
save_money_pct                      -0.066         0.011       -0.168    -5.772    0.000    -0.088    -0.043
save_money_commercial_bank_pct       0.183         0.011        0.299    17.371    0.000     0.163     0.204
save_money_SACCO_pct                 0.219         0.010        0.353    21.723    0.000     0.200     0.239
LOG_last_amt_saved                   0.001         0.000        0.097     3.502    0.000     0.000     0.001
borrow_money_commercial_bank_pct     0.095         0.023        0.066     4.208    0.000     0.051     0.140
borrow_money_SACCO_pct               0.096         0.020        0.075     4.761    0.000     0.056     0.135
borrow_money_mobile_money_pct        0.050         0.017        0.044     2.914    0.004     0.016     0.083
LOG_distance_commerical_bank         0.001         0.000        0.044     2.937    0.003     0.000     0.001
-------------------------------------------------------------------------------------------------------------
After retaining only the statistically significant independent variables, there is no improvement in R-Squared or Adjusted R-Squared. Furthermore, the Predicted R-Squared (0.298) is slightly below the Adjusted R-Squared (0.312), indicating that the model may perform slightly less effectively on unseen data. The RMSE (0.005) and MAE (0.002) give an idea of the typical error magnitude. However, the lower AIC and SBC values indicate a better fit once model complexity is taken into account.
The regression analysis suggests that several factors significantly influence the dependent variable, with certain predictors like save_money_SACCO_pct and save_money_commercial_bank_pct having particularly strong positive effects, while save_money_pct has a negative impact. The overall model explains a moderate portion of variance in the dependent variable and is statistically significant, but there may be additional variables not included in this analysis that could improve predictive power further.
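To make the comparison between a full and a reduced model concrete, the sketch below fits two nested lm() models and tabulates adjusted R-squared, AIC and BIC (the quantity olsrr reports as SBC). It runs on simulated data; the variables x1, x2 and noise are hypothetical stand-ins and not part of the Uganda data set.

```r
# Minimal sketch on simulated data (all names here are hypothetical).
set.seed(42)
n <- 500
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
sim$y <- 0.5 + 1.2 * sim$x1 - 0.8 * sim$x2 + rnorm(n)

# "full" keeps an irrelevant predictor, "reduced" drops it
full    <- lm(y ~ x1 + x2 + noise, data = sim)
reduced <- lm(y ~ x1 + x2, data = sim)

comparison <- data.frame(
  model  = c("full", "reduced"),
  adj_r2 = c(summary(full)$adj.r.squared, summary(reduced)$adj.r.squared),
  AIC    = c(AIC(full), AIC(reduced)),
  BIC    = c(BIC(full), BIC(reduced))
)
print(comparison)
```

Plain R-squared can never increase when a predictor is dropped from a nested model, which is why the adjusted R-squared and the information criteria are the more useful columns when judging a reduced model.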
10. Check for Multicollinearity
ols_vif_tol(sa_mlr1)
Variables Tolerance VIF
1 passport_doc_pct 0.8771595 1.140043
2 pay_slip_doc_pct 0.8958817 1.116219
3 self_sustaining_pct 0.9168856 1.090649
4 save_money_pct 0.2562338 3.902686
5 save_money_commercial_bank_pct 0.7324681 1.365247
6 save_money_SACCO_pct 0.8216329 1.217089
7 LOG_last_amt_saved 0.2807517 3.561866
8 borrow_money_commercial_bank_pct 0.8723239 1.146363
9 borrow_money_SACCO_pct 0.8660948 1.154608
10 borrow_money_mobile_money_pct 0.9345005 1.070090
11 LOG_distance_commerical_bank 0.9452141 1.057961
All VIF values of the revised model (sa_mlr1) lie between 1 and 5, which suggests at most moderate correlation among the predictors and is unlikely to be problematic.
ols_vif_tol(sa_mlr)
Variables Tolerance VIF
1 LOG_age_band 0.7282275 1.373197
2 gender_pct 0.9069884 1.102550
3 LOG_education_level 0.6363121 1.571556
4 LOG_employment_status 0.8117636 1.231886
5 mobile_user_pct 0.6799937 1.470602
6 national_ic_doc_pct 0.7546980 1.325033
7 passport_doc_pct 0.8306371 1.203895
8 utilities_bill_do_pct 0.7943377 1.258910
9 pay_slip_doc_pct 0.8613739 1.160936
10 self_sustaining_pct 0.8773222 1.139832
11 financial_advice_pct 0.8385705 1.192506
12 save_money_pct 0.2275655 4.394339
13 save_money_commercial_bank_pct 0.6768241 1.477489
14 save_money_SACCO_pct 0.7779627 1.285409
15 save_money_mobile_money_pct 0.6927860 1.443447
16 LOG_last_amt_saved 0.2614038 3.825499
17 LOG_last_amt_borrowed 0.7658660 1.305711
18 borrow_money_commercial_bank_pct 0.8631940 1.158488
19 borrow_money_SACCO_pct 0.8483674 1.178735
20 borrow_money_mobile_money_pct 0.8926030 1.120319
21 LOG_last_amt_sent 0.6060137 1.650128
22 LOG_last_amt_received 0.6633877 1.507414
23 LOG_distance_commerical_bank 0.2107685 4.744543
24 LOG_distance_SACCOS 0.5813174 1.720231
25 LOG_distance_ATM 0.2093546 4.776584
26 LOG_distance_mobile_money 0.7544986 1.325384
27 joint_account_pct 0.9827299 1.017574
All VIF values of the full model (sa_mlr) also lie between 1 and 5, which again suggests at most moderate correlation among the predictors.
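The Tolerance and VIF columns above are linked by VIF = 1 / Tolerance, where the tolerance of a predictor is 1 minus the R-squared obtained by regressing it on all other predictors. The sketch below reproduces that calculation in base R on simulated predictors; x1, x2 and x3 are hypothetical, and x2 is deliberately built to correlate with x1.

```r
# Minimal sketch on simulated predictors (hypothetical names).
set.seed(1)
n  <- 400
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)   # deliberately correlated with x1
x3 <- rnorm(n)                        # independent of the others
X  <- data.frame(x1, x2, x3)

# VIF depends only on the predictors, not on the response:
# regress each predictor on the remaining ones and use that R-squared.
vif_by_hand <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})
tolerance <- 1 / vif_by_hand

print(round(cbind(Tolerance = tolerance, VIF = vif_by_hand), 3))
```

In this toy example the correlated pair x1 and x2 should show a noticeably higher VIF than x3, which stays close to 1.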
10.1 Test for Non-Linearity
ols_plot_resid_fit(sa_mlr1)
10.2 Variable selection
sa_fw_mlr <- ols_step_forward_p(
sa_mlr1,
p_val = 0.05,
details = FALSE)
plot(sa_fw_mlr)
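The idea behind p-value-based forward selection can be sketched in base R. This is a simplified illustration on simulated data and not olsrr's implementation: at each step it adds the candidate whose coefficient has the smallest p-value, and stops once no candidate falls below the threshold. The variable names and the data are hypothetical; the 0.05 threshold mirrors the p_val argument used above.

```r
# Simplified forward selection by p-value (illustrative only).
set.seed(1)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 - 1.5 * d$x2 + rnorm(n)   # x3 has no true effect

p_enter   <- 0.05
selected  <- character(0)
remaining <- c("x1", "x2", "x3")

while (length(remaining) > 0) {
  # p-value of each candidate when added to the current model
  pvals <- sapply(remaining, function(v) {
    f <- reformulate(c(selected, v), response = "y")
    summary(lm(f, data = d))$coefficients[v, "Pr(>|t|)"]
  })
  if (min(pvals) >= p_enter) break            # nothing left worth adding
  best      <- names(pvals)[which.min(pvals)]
  selected  <- c(selected, best)
  remaining <- setdiff(remaining, best)
}

print(selected)
```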
10.3 Visualising model parameters
ggcoefstats(sa_mlr1,
sort = "ascending")
10.4 Test for Normality Assumption
The code chunk below uses ols_plot_resid_hist() of olsrr package to perform normality assumption test.
ols_plot_resid_hist(sa_mlr1)
The figure suggests that the residuals of the multiple linear regression model (i.e. sa_mlr1) roughly resemble a normal distribution.
For formal statistical tests, ols_test_normality() of the olsrr package can be used as shown in the code chunk below.
ols_test_normality(sa_mlr1)
Warning in ks.test.default(y, "pnorm", mean(y), sd(y)): ties should not be present for the one-sample Kolmogorov-Smirnov test
-----------------------------------------------
Test Statistic pvalue
-----------------------------------------------
Shapiro-Wilk 0.5989 0.0000
Kolmogorov-Smirnov 0.3139 0.0000
Cramer-von Mises 1052.2548 0.0000
Anderson-Darling 427.5649 0.0000
-----------------------------------------------
The summary table above reveals that the p-values of the four tests are far smaller than the alpha value of 0.05. Hence we reject the null hypothesis and infer that there is statistical evidence that the residuals are not normally distributed.
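Two of the four tests are available in base R and can be run directly on the residuals of any lm() fit. The sketch below does so on simulated data with deliberately skewed errors, so both tests are expected to reject normality; the data and names are hypothetical. The ks.test() call has the same form as the one shown in the warning message above.

```r
# Minimal sketch: formal normality tests on regression residuals.
set.seed(7)
n <- 1000
x <- rnorm(n)
y <- 2 + 3 * x + (rexp(n) - 1)   # skewed, mean-zero errors (hypothetical)
fit <- lm(y ~ x)
res <- residuals(fit)

# Shapiro-Wilk (base R accepts between 3 and 5000 observations)
sw <- shapiro.test(res)

# One-sample Kolmogorov-Smirnov against a normal distribution with the
# sample mean and standard deviation of the residuals
ks <- ks.test(res, "pnorm", mean(res), sd(res))

print(c(shapiro_p = sw$p.value, ks_p = ks$p.value))
```

With several thousand observations these tests can flag even small departures from normality, so it is reasonable to read them alongside the residual histogram rather than in isolation.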
10.5 Testing for Spatial Autocorrelation
In order to perform the spatial autocorrelation test, we need to convert uganda_data.sf from an sf data frame into a SpatialPointsDataFrame. First, we extract the residuals of the regression model and save them as a data frame.
mlr.output <- as.data.frame(sa_mlr1$residuals)
Next, we will join the newly created data frame with the uganda_data.sf object.
uganda_data.res.sf <- cbind(uganda_data.sf,
sa_mlr1$residuals) %>%
rename(`MLR_RES` = `sa_mlr1.residuals`)
The code chunk below will be used to perform the data conversion process.
uganda_sa.sp <- as_Spatial(uganda_data.res.sf)
uganda_sa.sp
class : SpatialPointsDataFrame
features : 3176
extent : -3395506, 718004.1, 9843626, 10407772 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=36 +south +datum=WGS84 +units=m +no_defs
variables : 63
names : HH_ID, Region, Rural_Urban, age_band, gender, education_level, employment_status, mobile_user, national_ic_doc, passport_doc, utilities_bill_doc, pay_slip_doc, self_sustaining, financial_advice, save_money, ...
min values : 001001, CENTRAL, Rural, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
max values : 321087, WESTERN, Urban, 7, 2, 9, 99, 2, 2, 2, 2, 2, 2, 2, 2, ...
The code chunk below will turn on the interactive mode of tmap.
tmap_mode("view")
tmap mode set to interactive viewing
The code chunk below is used to create an interactive point symbol map.
tm_shape(boundaries_cleaned)+
tmap_options(check.and.fix = TRUE) +
tm_polygons(alpha = 0.4) +
tm_shape(uganda_data.res.sf) +
tm_dots(col = "MLR_RES",
alpha = 0.6,
style="quantile") +
tm_view(set.zoom.limits = c(11,14))
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting
10.6 Spatial stationarity test
First, we will derive a k-nearest-neighbour list (k = 6) by using st_knn() and row-standardised weights by using st_weights(), both from sfdep.
uganda_data_res_sf <- uganda_data.res.sf %>%
mutate(nb = st_knn(geometry, k=6,
longlat = FALSE),
wt = st_weights(nb,
style = "W"),
.before = 1)
Next, global_moran_perm() of sfdep is used to perform the global Moran's I permutation test.
global_moran_perm(uganda_data_res_sf$MLR_RES,
uganda_data_res_sf$nb,
uganda_data_res_sf$wt,
alternative = "two.sided",
nsim = 99)
Monte-Carlo simulation of Moran I
data: x
weights: listw
number of simulations + 1: 100
statistic = 0.016767, observed rank = 95, p-value = 0.1
alternative hypothesis: two.sided
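In the output above the pseudo p-value is 0.1, which is above 0.05, so the null hypothesis of spatial randomness in the residuals is not rejected at the 5% level. To show what the permutation test does, the sketch below computes Moran's I by hand with row-standardised 6-nearest-neighbour weights and compares it against 99 random reshufflings. It uses simulated coordinates and values (all hypothetical), and the two-sided p-value is a simple rank-based version that is not guaranteed to match spdep's or sfdep's internals exactly.

```r
# Minimal sketch of a Moran's I permutation test in base R.
set.seed(123)
n <- 200
coords <- cbind(runif(n), runif(n))   # hypothetical point locations
x <- rnorm(n)                         # values with no spatial structure

# Row-standardised k-nearest-neighbour weights (comparable to style = "W")
k <- 6
D <- as.matrix(dist(coords))
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nn <- order(D[i, ])[2:(k + 1)]      # position 1 is the point itself
  W[i, nn] <- 1 / k
}

moran_i <- function(v, W) {
  z <- v - mean(v)
  (length(v) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

obs  <- moran_i(x, W)
nsim <- 99
sims <- replicate(nsim, moran_i(sample(x), W))   # shuffle values over locations

p_greater <- (sum(sims >= obs) + 1) / (nsim + 1)
p_less    <- (sum(sims <= obs) + 1) / (nsim + 1)
p_two     <- min(1, 2 * min(p_greater, p_less))

print(c(moran_I = obs, p_two_sided = p_two))
```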
11. Building Geographically Weighted Regression Models using GWmodel
11.1 Building Fixed Bandwidth GWR Model
In the code chunk below, bw.gwr() of the GWmodel package is used to determine the optimal fixed bandwidth to use in the model. Notice that the argument adaptive is set to FALSE, which indicates that we are interested in computing a fixed bandwidth.
bw_fixed_sa <- bw.gwr(formula = savings_account_pct ~ passport_doc_pct +
pay_slip_doc_pct + self_sustaining_pct + save_money_pct +
save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data=uganda_data_res_sf,
approach="CV",
kernel="gaussian",
adaptive=FALSE,
longlat=FALSE)
Take a cup of tea and have a break, it will take a few minutes.
-----A kind suggestion from GWmodel development group
Fixed bandwidth: 2546005 CV score: 0.07780339
Fixed bandwidth: 1573832 CV score: 0.07782599
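The CV score printed for each candidate bandwidth is a leave-one-out prediction error. The sketch below illustrates the principle in base R for a fixed Gaussian kernel: for every observation, a weighted least squares fit is calibrated without that observation and used to predict it, and the squared errors are summed. It is a simplified illustration on simulated data and does not reproduce GWmodel's search routine; the data, the candidate bandwidths and the helper gwr_cv() are hypothetical.

```r
# Minimal sketch of leave-one-out CV for a fixed-bandwidth Gaussian kernel.
set.seed(2024)
n <- 150
coords <- cbind(runif(n, 0, 100), runif(n, 0, 100))
x <- rnorm(n)
beta1 <- 1 + coords[, 1] / 50            # slope drifts from west to east
y <- 2 + beta1 * x + rnorm(n, sd = 0.3)

D <- as.matrix(dist(coords))             # Euclidean distances
X <- cbind(1, x)                         # design matrix with intercept

gwr_cv <- function(bw) {
  sq_err <- numeric(n)
  for (i in seq_len(n)) {
    w <- exp(-0.5 * (D[i, ] / bw)^2)     # Gaussian kernel weights
    w[i] <- 0                            # leave observation i out
    XtW <- t(X * w)                      # X'W (row j of X scaled by w[j])
    b <- solve(XtW %*% X, XtW %*% y)     # local weighted least squares
    sq_err[i] <- (y[i] - sum(X[i, ] * b))^2
  }
  sum(sq_err)
}

bandwidths <- c(10, 20, 40, 80, 1e6)     # 1e6 behaves like a global model
scores <- sapply(bandwidths, gwr_cv)
print(data.frame(bandwidth = bandwidths, cv_score = scores))
```

A very large bandwidth gives every observation almost equal weight, so its score approximates that of the global regression. When the best score is barely lower than the score at a very large bandwidth, as in the output above, there is little evidence of spatially varying relationships at that scale.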
11.2 GWmodel method - fixed bandwidth
Use the code chunk below to calibrate the gwr model using fixed bandwidth and gaussian kernel.
gwr.fixed_sa <- gwr.basic(formula = savings_account_pct ~ passport_doc_pct +
pay_slip_doc_pct + self_sustaining_pct + save_money_pct +
save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data = uganda_data_res_sf,
bw = bw_fixed_sa,
kernel = "gaussian",
longlat = FALSE)
The output is saved in a list of class “gwrm”. The code below can be used to display the model output.
gwr.fixed_sa
***********************************************************************
* Package GWmodel *
***********************************************************************
Program starts at: 2024-11-15 02:35:54.207282
Call:
gwr.basic(formula = savings_account_pct ~ passport_doc_pct +
pay_slip_doc_pct + self_sustaining_pct + save_money_pct +
save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data = uganda_data_res_sf, bw = bw_fixed_sa, kernel = "gaussian",
longlat = FALSE)
Dependent (y) variable: savings_account_pct
Independent variables: passport_doc_pct pay_slip_doc_pct self_sustaining_pct save_money_pct save_money_commercial_bank_pct save_money_SACCO_pct LOG_last_amt_saved borrow_money_commercial_bank_pct borrow_money_SACCO_pct borrow_money_mobile_money_pct LOG_distance_commerical_bank
Number of data points: 3176
***********************************************************************
* Results of Global Regression *
***********************************************************************
Call:
lm(formula = formula, data = data)
Residuals:
Min 1Q Median 3Q Max
-0.032961 -0.000829 -0.000054 0.000397 0.017458
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0166977 0.0020516 8.139 5.68e-16 ***
passport_doc_pct 0.0468483 0.0150517 3.112 0.001872 **
pay_slip_doc_pct 0.0535268 0.0156865 3.412 0.000652 ***
self_sustaining_pct 0.0227272 0.0079988 2.841 0.004521 **
save_money_pct -0.0656226 0.0113683 -5.772 8.57e-09 ***
save_money_commercial_bank_pct 0.1834748 0.0105622 17.371 < 2e-16 ***
save_money_SACCO_pct 0.2193000 0.0100951 21.723 < 2e-16 ***
LOG_last_amt_saved 0.0005877 0.0001678 3.502 0.000468 ***
borrow_money_commercial_bank_pct 0.0953930 0.0226693 4.208 2.65e-05 ***
borrow_money_SACCO_pct 0.0955746 0.0200753 4.761 2.01e-06 ***
borrow_money_mobile_money_pct 0.0496369 0.0170359 2.914 0.003597 **
LOG_distance_commerical_bank 0.0006512 0.0002217 2.937 0.003334 **
---Significance stars
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.004901 on 3164 degrees of freedom
Multiple R-squared: 0.3141
Adjusted R-squared: 0.3117
F-statistic: 131.7 on 11 and 3164 DF, p-value: < 2.2e-16
***Extra Diagnostic information
Residual sum of squares: 0.07599864
Sigma(hat): 0.004893273
AIC: -24754.87
AICc: -24754.75
BIC: -27747.22
***********************************************************************
* Results of Geographically Weighted Regression *
***********************************************************************
*********************Model calibration information*********************
Kernel function: gaussian
Fixed bandwidth: 2546005
Regression points: the same locations as observations are used.
Distance metric: Euclidean distance metric is used.
****************Summary of GWR coefficient estimates:******************
Min. 1st Qu. Median
Intercept 0.01624078 0.01626142 0.01627810
passport_doc_pct 0.04425666 0.04665337 0.04678180
pay_slip_doc_pct 0.04617598 0.05344475 0.05384274
self_sustaining_pct 0.02110392 0.02360797 0.02361127
save_money_pct -0.06814142 -0.06544134 -0.06534997
save_money_commercial_bank_pct 0.18247981 0.18272256 0.18289746
save_money_SACCO_pct 0.21497232 0.22035550 0.22047133
LOG_last_amt_saved 0.00057433 0.00057496 0.00057544
borrow_money_commercial_bank_pct 0.09597604 0.09797034 0.09822798
borrow_money_SACCO_pct 0.08112327 0.09800867 0.09871645
borrow_money_mobile_money_pct 0.04889972 0.04911912 0.04928691
LOG_distance_commerical_bank 0.00062156 0.00062261 0.00062350
3rd Qu. Max.
Intercept 0.01629940 0.0179
passport_doc_pct 0.04688121 0.0470
pay_slip_doc_pct 0.05410534 0.0545
self_sustaining_pct 0.02361922 0.0236
save_money_pct -0.06527789 -0.0652
save_money_commercial_bank_pct 0.18312609 0.1887
save_money_SACCO_pct 0.22054487 0.2206
LOG_last_amt_saved 0.00057604 0.0006
borrow_money_commercial_bank_pct 0.09862224 0.0993
borrow_money_SACCO_pct 0.09911163 0.0999
borrow_money_mobile_money_pct 0.04950421 0.0548
LOG_distance_commerical_bank 0.00062480 0.0007
************************Diagnostic information*************************
Number of data points: 3176
Effective number of parameters (2trace(S) - trace(S'S)): 13.15078
Effective degrees of freedom (n-2trace(S) + trace(S'S)): 3162.849
AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): -24755.9
AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): -24770.66
BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): -27857.41
Residual sum of squares: 0.07594065
R-square value: 0.3146122
Adjusted R-square value: 0.3117615
***********************************************************************
Program stops at: 2024-11-15 02:35:58.262614
The report shows that the AICc of the fixed bandwidth GWR model is -24755.9, which is only slightly smaller than the AICc of the global multiple linear regression model (-24754.75).
11.3 Building Adaptive Bandwidth GWR Model
Calibrate the GWR-based savings account model by using the adaptive bandwidth approach.
11.3.1 Computing the adaptive bandwidth
Use bw.gwr() to determine the recommended number of data points (nearest neighbours) to use.
The code chunk looks very similar to the one used to compute the fixed bandwidth, except that the adaptive argument has been changed to TRUE.
bw.adaptive_sa <- bw.gwr(formula = savings_account_pct ~ passport_doc_pct +
pay_slip_doc_pct + self_sustaining_pct + save_money_pct +
save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data=uganda_data_res_sf,
approach="CV",
kernel="gaussian",
adaptive=TRUE,
longlat=FALSE)
Take a cup of tea and have a break, it will take a few minutes.
-----A kind suggestion from GWmodel development group
Adaptive bandwidth: 1970 CV score: 0.07684179
Adaptive bandwidth: 1225 CV score: 0.07640372
Adaptive bandwidth: 764 CV score: 0.07639378
The result shows that 764 is the recommended number of data points (nearest neighbours) to use.
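With an adaptive bandwidth, the kernel width is expressed as a number of neighbours rather than a distance, so it shrinks where observations are dense and widens where they are sparse. The sketch below illustrates one common way to do this in base R: the local bandwidth at each point is taken as the distance to its k-th nearest observation and then plugged into a Gaussian kernel. The point layout, k = 30 and the helper adaptive_weights() are hypothetical, and the details (for example whether the point itself is counted) are assumptions of this sketch rather than a description of GWmodel's internals.

```r
# Minimal sketch of adaptive Gaussian kernel weights.
set.seed(99)
# 250 points clustered in the west, 50 scattered in the east (hypothetical)
coords <- rbind(
  cbind(rnorm(250, 20, 5), rnorm(250, 50, 5)),
  cbind(runif(50, 40, 100), runif(50, 0, 100))
)
D <- as.matrix(dist(coords))
k <- 30

# Local bandwidth: distance to the k-th nearest observation.
# The point itself (distance 0) is counted as the first one here.
local_bw <- apply(D, 1, function(d) sort(d)[k])

adaptive_weights <- function(i) exp(-0.5 * (D[i, ] / local_bw[i])^2)

w1 <- adaptive_weights(1)
print(summary(local_bw[1:250]))     # dense cluster: small bandwidths
print(summary(local_bw[251:300]))   # sparse area: large bandwidths
```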
11.3.2 Constructing the adaptive bandwidth gwr model
gwr_adaptive_sa <- gwr.basic(formula = savings_account_pct ~ passport_doc_pct +
pay_slip_doc_pct + self_sustaining_pct + save_money_pct +
save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data=uganda_data.sf,
bw=bw.adaptive_sa,
kernel = 'gaussian',
adaptive=TRUE,
longlat = FALSE)
The code below can be used to display the model output.
gwr_adaptive_sa
***********************************************************************
* Package GWmodel *
***********************************************************************
Program starts at: 2024-11-15 02:36:03.548228
Call:
gwr.basic(formula = savings_account_pct ~ passport_doc_pct +
pay_slip_doc_pct + self_sustaining_pct + save_money_pct +
save_money_commercial_bank_pct + save_money_SACCO_pct + LOG_last_amt_saved +
borrow_money_commercial_bank_pct + borrow_money_SACCO_pct +
borrow_money_mobile_money_pct + LOG_distance_commerical_bank,
data = uganda_data.sf, bw = bw.adaptive_sa, kernel = "gaussian",
adaptive = TRUE, longlat = FALSE)
Dependent (y) variable: savings_account_pct
Independent variables: passport_doc_pct pay_slip_doc_pct self_sustaining_pct save_money_pct save_money_commercial_bank_pct save_money_SACCO_pct LOG_last_amt_saved borrow_money_commercial_bank_pct borrow_money_SACCO_pct borrow_money_mobile_money_pct LOG_distance_commerical_bank
Number of data points: 3176
***********************************************************************
* Results of Global Regression *
***********************************************************************
Call:
lm(formula = formula, data = data)
Residuals:
Min 1Q Median 3Q Max
-0.032961 -0.000829 -0.000054 0.000397 0.017458
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0166977 0.0020516 8.139 5.68e-16 ***
passport_doc_pct 0.0468483 0.0150517 3.112 0.001872 **
pay_slip_doc_pct 0.0535268 0.0156865 3.412 0.000652 ***
self_sustaining_pct 0.0227272 0.0079988 2.841 0.004521 **
save_money_pct -0.0656226 0.0113683 -5.772 8.57e-09 ***
save_money_commercial_bank_pct 0.1834748 0.0105622 17.371 < 2e-16 ***
save_money_SACCO_pct 0.2193000 0.0100951 21.723 < 2e-16 ***
LOG_last_amt_saved 0.0005877 0.0001678 3.502 0.000468 ***
borrow_money_commercial_bank_pct 0.0953930 0.0226693 4.208 2.65e-05 ***
borrow_money_SACCO_pct 0.0955746 0.0200753 4.761 2.01e-06 ***
borrow_money_mobile_money_pct 0.0496369 0.0170359 2.914 0.003597 **
LOG_distance_commerical_bank 0.0006512 0.0002217 2.937 0.003334 **
---Significance stars
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.004901 on 3164 degrees of freedom
Multiple R-squared: 0.3141
Adjusted R-squared: 0.3117
F-statistic: 131.7 on 11 and 3164 DF, p-value: < 2.2e-16
***Extra Diagnostic information
Residual sum of squares: 0.07599864
Sigma(hat): 0.004893273
AIC: -24754.87
AICc: -24754.75
BIC: -27747.22
***********************************************************************
* Results of Geographically Weighted Regression *
***********************************************************************
*********************Model calibration information*********************
Kernel function: gaussian
Adaptive bandwidth: 764 (number of nearest neighbours)
Regression points: the same locations as observations are used.
Distance metric: Euclidean distance metric is used.
****************Summary of GWR coefficient estimates:******************
Min. 1st Qu. Median
Intercept 0.01100786 0.01328636 0.01456606
passport_doc_pct 0.02561207 0.03882329 0.04702325
pay_slip_doc_pct -0.01638144 0.00222473 0.04980080
self_sustaining_pct 0.00464580 0.01102064 0.01799666
save_money_pct -0.09221844 -0.07057591 -0.05649395
save_money_commercial_bank_pct 0.11863544 0.13547024 0.17265364
save_money_SACCO_pct 0.19076097 0.21804899 0.23073936
LOG_last_amt_saved 0.00035383 0.00047877 0.00051201
borrow_money_commercial_bank_pct -0.01082834 0.03301703 0.08165212
borrow_money_SACCO_pct 0.00749615 0.08501548 0.11901933
borrow_money_mobile_money_pct 0.01177161 0.02741277 0.05205928
LOG_distance_commerical_bank 0.00029778 0.00038608 0.00051759
3rd Qu. Max.
Intercept 0.01569714 0.0181
passport_doc_pct 0.05642495 0.0828
pay_slip_doc_pct 0.10485771 0.1731
self_sustaining_pct 0.02399954 0.0415
save_money_pct -0.04559956 -0.0353
save_money_commercial_bank_pct 0.21462934 0.2431
save_money_SACCO_pct 0.26313271 0.3168
LOG_last_amt_saved 0.00054552 0.0007
borrow_money_commercial_bank_pct 0.14267422 0.2249
borrow_money_SACCO_pct 0.14935916 0.2099
borrow_money_mobile_money_pct 0.08735646 0.1175
LOG_distance_commerical_bank 0.00066291 0.0008
************************Diagnostic information*************************
Number of data points: 3176
Effective number of parameters (2trace(S) - trace(S'S)): 42.64019
Effective degrees of freedom (n-2trace(S) + trace(S'S)): 3133.36
AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): -24895.67
AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): -24929.82
BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): -27883.62
Residual sum of squares: 0.07180195
R-square value: 0.3519653
Adjusted R-square value: 0.3431437
***********************************************************************
Program stops at: 2024-11-15 02:36:07.949516
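The AICc values reported in the three outputs above can be put side by side to compare the global model with the two GWR calibrations. The snippet below only copies those reported numbers into a vector and computes the differences relative to the global model; the object names are hypothetical.

```r
# AICc values copied from the GWmodel reports above
aicc <- c(
  global_ols   = -24754.75,
  gwr_fixed    = -24755.9,
  gwr_adaptive = -24895.67
)

# Difference relative to the global regression (negative = better fit)
delta <- aicc - aicc[["global_ols"]]
print(round(delta, 2))

best <- names(which.min(aicc))
print(best)
```

The fixed bandwidth model improves AICc by only about 1, whereas the adaptive bandwidth model improves it by about 141, in line with its higher R-square value (0.352 versus 0.314). A commonly cited rule of thumb treats AICc differences smaller than about 3 as negligible, so only the adaptive model shows a clear gain over the global regression.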
11.3.4 Visualising GWR Output
To visualise the fields in SDF, we first need to convert it into an sf data frame by using the code chunk below.
gwr_adaptive_output <- as.data.frame(
gwr_adaptive_sa$SDF) %>%
select(-c(2:15))
gwr_sf_adaptive <- cbind(uganda_data.sf,
gwr_adaptive_output)
Next, glimpse() is used to display the content of the gwr_sf_adaptive sf data frame.
glimpse(gwr_sf_adaptive)
Rows: 3,176
Columns: 92
$ HH_ID <chr> "001001", "001019", "001028", "001…
$ Region <chr> "NORTHERN", "NORTHERN", "NORTHERN"…
$ Rural_Urban <chr> "Urban", "Urban", "Urban", "Urban"…
$ age_band <dbl> 4, 4, 3, 4, 4, 1, 4, 5, 3, 2, 2, 4…
$ gender <dbl> 2, 2, 2, 1, 2, 1, 2, 1, 1, 2, 2, 1…
$ education_level <dbl> 6, 2, 1, 2, 3, 2, 3, 6, 2, 2, 3, 6…
$ employment_status <dbl> 1, 5, 5, 1, 4, 9, 1, 7, 1, 5, 5, 4…
$ mobile_user <dbl> 2, 1, 2, 2, 1, 2, 1, 2, 1, 1, 1, 1…
$ national_ic_doc <dbl> 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1…
$ passport_doc <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1…
$ utilities_bill_doc <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ pay_slip_doc <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ self_sustaining <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ financial_advice <dbl> 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1…
$ save_money <dbl> 1, 1, 2, 2, 1, 2, 1, 1, 2, 1, 2, 2…
$ save_money_commercial_bank <dbl> 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2…
$ save_money_SACCO <dbl> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ save_money_mobile_money <dbl> 1, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2…
$ last_amt_saved <dbl> 4, 1, 9, 9, 1, 9, 1, 8, 9, 1, 9, 9…
$ last_amt_borrowed <dbl> 2, 1, 998, 998, 1, 998, 2, 998, 1,…
$ borrow_money_commercial_bank <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2…
$ borrow_money_SACCO <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ borrow_money_mobile_money <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2…
$ last_amt_sent <dbl> 3, 1, 998, 998, 1, 998, 1, 2, 1, 9…
$ last_amt_received <dbl> 1, 1, 998, 998, 998, 998, 998, 997…
$ own_insurance <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ distance_commerical_bank <dbl> 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 4, 4…
$ distance_SACCOS <dbl> 2, 2, 1, 2, 1, 1, 1, 2, 1, 2, 3, 2…
$ distance_ATM <dbl> 2, 2, 2, 4, 1, 1, 1, 2, 2, 2, 4, 4…
$ distance_mobile_money <dbl> 1, 2, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2…
$ savings_account <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ joint_account <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ county_name <chr> "LABWOR", "LABWOR", "LABWOR", "LAB…
$ LOG_age_band <dbl> 1.3862944, 1.3862944, 1.0986123, 1…
$ gender_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ LOG_education_level <dbl> 1.7917595, 0.6931472, 0.0000000, 0…
$ LOG_employment_status <dbl> 0.0000000, 1.6094379, 1.6094379, 0…
$ mobile_user_pct <dbl> 0.06297229, 0.03148615, 0.06297229…
$ national_ic_doc_pct <dbl> 0.03148615, 0.03148615, 0.03148615…
$ passport_doc_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ utilities_bill_do_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ pay_slip_doc_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ self_sustaining_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ financial_advice_pct <dbl> 0.03148615, 0.03148615, 0.06297229…
$ save_money_pct <dbl> 0.03148615, 0.03148615, 0.06297229…
$ save_money_commercial_bank_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ save_money_SACCO_pct <dbl> 0.06297229, 0.03148615, 0.06297229…
$ save_money_mobile_money_pct <dbl> 0.03148615, 0.06297229, 0.06297229…
$ LOG_last_amt_saved <dbl> 1.386294, 0.000000, 2.197225, 2.19…
$ LOG_last_amt_borrowed <dbl> 0.6931472, 0.0000000, 6.9057533, 6…
$ borrow_money_commercial_bank_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ borrow_money_SACCO_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ borrow_money_mobile_money_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ LOG_last_amt_sent <dbl> 1.0986123, 0.0000000, 6.9057533, 6…
$ LOG_last_amt_received <dbl> 0.000000, 0.000000, 6.905753, 6.90…
$ own_insurance_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ LOG_distance_commerical_bank <dbl> 0.6931472, 0.6931472, 0.6931472, 0…
$ LOG_distance_SACCOS <dbl> 0.6931472, 0.6931472, 0.0000000, 0…
$ LOG_distance_ATM <dbl> 0.6931472, 0.6931472, 0.6931472, 1…
$ LOG_distance_mobile_money <dbl> 0.0000000, 0.6931472, 1.0986123, 0…
$ savings_account_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ joint_account_pct <dbl> 0.06297229, 0.06297229, 0.06297229…
$ Intercept <dbl> 0.01374018, 0.01374214, 0.01373844…
$ CV_Score <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Stud_residual <dbl> -0.1563762060, 1.4451162350, 0.059…
$ Intercept_SE <dbl> 0.002583321, 0.002583638, 0.002585…
$ passport_doc_pct_SE <dbl> 0.01850887, 0.01851101, 0.01852315…
$ pay_slip_doc_pct_SE <dbl> 0.02009748, 0.02010038, 0.02011371…
$ self_sustaining_pct_SE <dbl> 0.01040533, 0.01040718, 0.01041443…
$ save_money_pct_SE <dbl> 0.01375069, 0.01375226, 0.01376000…
$ save_money_commercial_bank_pct_SE <dbl> 0.01342778, 0.01342961, 0.01343743…
$ save_money_SACCO_pct_SE <dbl> 0.01397570, 0.01397852, 0.01399230…
$ LOG_last_amt_saved_SE <dbl> 0.0002015484, 0.0002015690, 0.0002…
$ borrow_money_commercial_bank_pct_SE <dbl> 0.02771849, 0.02772048, 0.02773078…
$ borrow_money_SACCO_pct_SE <dbl> 0.02956245, 0.02956804, 0.02959599…
$ borrow_money_mobile_money_pct_SE <dbl> 0.02012100, 0.02012268, 0.02013044…
$ LOG_distance_commerical_bank_SE <dbl> 0.0002695990, 0.0002696291, 0.0002…
$ Intercept_TV <dbl> 5.318805, 5.318912, 5.313990, 5.30…
$ passport_doc_pct_TV <dbl> 4.283297, 4.282622, 4.283510, 4.28…
$ pay_slip_doc_pct_TV <dbl> 6.106481, 6.106079, 6.109939, 6.11…
$ self_sustaining_pct_TV <dbl> 2.044626, 2.043120, 2.041662, 2.04…
$ save_money_pct_TV <dbl> -3.376885, -3.375175, -3.369971, -…
$ save_money_commercial_bank_pct_TV <dbl> 9.783221, 9.779953, 9.767750, 9.75…
$ save_money_SACCO_pct_TV <dbl> 15.48776, 15.48864, 15.47912, 15.4…
$ LOG_last_amt_saved_TV <dbl> 2.643905, 2.642684, 2.639057, 2.63…
$ borrow_money_commercial_bank_pct_TV <dbl> 0.3386409, 0.3377677, 0.3331410, 0…
$ borrow_money_SACCO_pct_TV <dbl> 6.741598, 6.739035, 6.732029, 6.72…
$ borrow_money_mobile_money_pct_TV <dbl> 0.8783295, 0.8780371, 0.8753335, 0…
$ LOG_distance_commerical_bank_TV <dbl> 2.582395, 2.582094, 2.581151, 2.57…
$ Local_R2 <dbl> 0.3668502, 0.3668682, 0.3669406, 0…
$ geometry <POINT [m]> POINT (572712.4 10295984), P…
$ geometry.1 <POINT [m]> POINT (572712.4 10295984), P…
summary(gwr_adaptive_sa$SDF$yhat)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.03960 0.06251 0.06308 0.06184 0.06352 0.06540
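The _SE and _TV columns listed above hold the local standard error and local t-value of each coefficient, where the t-value is the local estimate divided by its standard error. The sketch below checks this for the intercept of the first observation and then flags which local coefficients at that location exceed the usual 1.96 threshold. The numbers are copied from the first entries of the glimpse() output; the 1.96 cut-off is a normal approximation at the 5% level with no adjustment for multiple testing, and is an assumption of this sketch.

```r
# First-observation values copied from the glimpse() output above
intercept    <- 0.01374018
intercept_se <- 0.002583321
print(intercept / intercept_se)    # should be close to Intercept_TV = 5.318805

tv <- c(
  passport_doc_pct                 = 4.283297,
  pay_slip_doc_pct                 = 6.106481,
  self_sustaining_pct              = 2.044626,
  save_money_pct                   = -3.376885,
  save_money_commercial_bank_pct   = 9.783221,
  save_money_SACCO_pct             = 15.48776,
  LOG_last_amt_saved               = 2.643905,
  borrow_money_commercial_bank_pct = 0.3386409,
  borrow_money_SACCO_pct           = 6.741598,
  borrow_money_mobile_money_pct    = 0.8783295,
  LOG_distance_commerical_bank     = 2.582395
)

# Two-sided 5% threshold under a normal approximation (assumption)
significant <- abs(tv) > 1.96
not_significant <- names(tv)[!significant]
print(not_significant)
```

At this location, borrowing from commercial banks and borrowing via mobile money are not locally significant even though both are significant in the global model, which is the kind of spatial variation that mapping the _TV columns is meant to reveal.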
11.4 Visualising local R2
The code chunk below is used to create an interactive point symbol map.
tmap_mode("view")
tmap mode set to interactive viewing
tmap_options(check.and.fix = TRUE)
tm_shape(boundaries_sf)+
tm_polygons(alpha = 0.1) +
tm_shape(gwr_sf_adaptive) +
tm_dots(col = "Local_R2",
border.col = "gray60",
border.lwd = 1) +
tm_view(set.zoom.limits = c(11,14))
tmap_mode("plot")
tmap mode set to plotting
11.5 Visualising coefficient estimates
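The code chunk below is used to create two synchronised interactive point symbol maps, one showing the local standard errors and one showing the local t-values of passport_doc_pct.
tmap_mode("view")
tmap mode set to interactive viewing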
passport_doc_pct_SE <- tm_shape(boundaries_cleaned)+
tm_polygons(alpha = 0.1) +
tm_shape(gwr_sf_adaptive) +
tm_dots(col = "passport_doc_pct_SE",
border.col = "gray60",
border.lwd = 1) +
tm_view(set.zoom.limits = c(11,14))
passport_doc_pct_TV <- tm_shape(boundaries_cleaned)+
tm_polygons(alpha = 0.1) +
tm_shape(gwr_sf_adaptive) +
tm_dots(col = "passport_doc_pct_TV",
border.col = "gray60",
border.lwd = 1) +
tm_view(set.zoom.limits = c(11,14))
tmap_arrange(passport_doc_pct_SE, passport_doc_pct_TV,
asp=1, ncol=2,
sync = TRUE)
tmap_mode("plot")
tmap mode set to plotting
The Bank for International Settlements has published a paper on financial inclusion that, like this exercise, focuses on usage, access, and barriers. As more countries prioritise anti-money laundering controls, people in developing nations may face challenges in obtaining the documentation needed to verify their identity and land ownership. In a similar project I completed last semester, I suggested using satellite technology to monitor lenders’ plots of land. However, location remains a concern, and we recommended supporting these transactions on non-smartphones, an approach that has proven effective in India. It would therefore be beneficial to gather more information so that financial inclusion can reach further.